HTML Markdown 2.0 Documentation

Welcome to HTML Markdown 2.0

with URL spitter

HTML Markdown is a small, fast, efficient way to convert Web pages into regular, readable, printable text. This document will help you use HTML Markdown 2.0 more effectively.



Introduction

HTML Markdown is a Macintosh drag-and-drop application that converts HTML documents into plain text files. Version 2.0 can also "spit" all of the URLs in a group of files into a single text file. There are several reasons to convert web pages to text. Among them:
  1. Text files are an easy way to archive and print your web pages
  2. Web pages transferred through email and ftp sometimes retain tags, making them difficult to read
  3. Users of the Lynx browser often have trouble saving web pages as text files and need to strip away the HTML tags
  4. Other browsers print web pages with all the graphics, which just take up space if all you're interested in is the content
  5. People who want to change the look of their web sites but keep the same content can use HTML Markdown to "start over" without losing what's there
  6. You may want to spell check your HTML file, and your word processor chokes on HTML

Installing HTML Markdown 2.0

If you're reading this file, then you have already successfully downloaded and decompressed the HTML Markdown archive. To use HTML Markdown effectively, you should put it on a hard drive. If it is not already on your hard drive, copy it over. HTML Markdown 2.0 is now installed.


Using HTML Markdown 2.0

HTML Markdown is very easy to use. There are three basic steps:
  1. Drag the files to convert onto the HTML Markdown icon
  2. Select the appropriate options from the Job Ticket
  3. Click "Go!"
Since HTML Markdown is a drag-and-drop-only application, you must drag your files onto its icon in the Finder to convert them. If the Finder doesn't let you drop files on the icon, you probably need to rebuild your desktop file. To do this, restart your computer. When you are almost at the desktop, hold down the Command and Option keys until the Mac asks whether you want to rebuild the desktop file, then click OK.

When you drag files onto the HTML Markdown icon, you are given a "job ticket" where you can select conversion options. The following options are available:


Common Problems Solved

I drag a file onto the HTML Markdown icon and nothing happens!

There are two likely reasons why this is happening. Both are pretty simple to fix. The first possibility is that the file you are trying to convert is not an HTML file. HTML Markdown only works on HTML files.

The second possibility is that your desktop file needs to be rebuilt. The desktop file is an invisible file on your hard drive that tells the computer which programs can open which files. If the desktop file isn't rebuilt once in a while, the computer gets a bit confused. To rebuild your desktop file, restart your computer. When you're almost at the desktop, hold down the Command and Option keys. The Mac will ask you if you want to rebuild your desktop file. You do, so click OK.

There's actually a third case, but it's very rare. Sometimes the desktop file gets corrupted and even rebuilding it doesn't do the trick. In this case, you have to make this invisible file visible and then throw it in the trash. You can use ResEdit to do it, but if you've never used ResEdit before, I wouldn't recommend it. Please contact me if the first two solutions don't work.


HTML Markdown doesn't convert my files correctly

The way HTML Markdown works is basically that it copies the HTML file into a new file and skips everything between < and >. It's actually quite a bit more complicated than that, but this should suffice for an explanation. So if your file follows the guideline that everything that's HTML comes between < and >, your file should be converted fine.
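
The simplified description above can be sketched in a few lines of Python. This is only an illustration of the "skip everything between < and >" idea, not HTML Markdown's actual code, which is not public. The function name strip_tags is made up for this example.

```python
def strip_tags(html: str) -> str:
    """Copy the input through, skipping everything between '<' and '>'.

    Hypothetical helper for illustration; it does no conversions and
    does not decode special characters.
    """
    out = []
    inside_tag = False
    for ch in html:
        if ch == "<":
            inside_tag = True        # start skipping
        elif ch == ">" and inside_tag:
            inside_tag = False       # tag is over, resume copying
        elif not inside_tag:
            out.append(ch)           # ordinary text is copied as-is
    return "".join(out)


if __name__ == "__main__":
    print(strip_tags("<p>Hello, <b>world</b>!</p>"))
```

A file that keeps all of its HTML between < and > passes through this kind of loop cleanly, which is why well-formed pages convert without trouble.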

Potential problems come in only with conversions -- that is, converting the "alt" option in img tags, <HR>'s, etc. If your HTML is invalid (for example, you don't have a close quote for an alt), HTML Markdown will ignore that tag completely. This makes it look as if HTML Markdown missed the tag. In reality, it was just skipping over something that was not valid. Many browsers can display incorrect HTML. HTML Markdown can usually deal with it too, but it requires strict adherence to the rules in img tags and href options.
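
Here is a rough Python sketch of why a missing close quote makes a tag seem to vanish. It assumes, purely for illustration, that alt text is written out in square brackets and that <HR> becomes a row of dashes; the real output format of HTML Markdown may differ, and the names TAG, ALT, and convert are hypothetical.

```python
import re

# Anything between < and > counts as a tag.
TAG = re.compile(r"<([^<>]*)>")
# Strict rule: the alt value must have both an opening and a closing quote.
ALT = re.compile(r'alt\s*=\s*"([^"]*)"', re.IGNORECASE)


def _convert_tag(match):
    body = match.group(1).strip()
    name = body.split(None, 1)[0].lower() if body else ""
    if name == "hr":
        return "\n" + "-" * 40 + "\n"    # assumed rendering of a rule
    if name == "img":
        alt = ALT.search(body)
        # No valid alt (for example, no close quote): drop the tag entirely.
        return "[" + alt.group(1) + "]" if alt else ""
    return ""                            # every other tag is removed


def convert(html):
    return TAG.sub(_convert_tag, html)


if __name__ == "__main__":
    print(convert('<img src="logo.gif" alt="Logo"> Welcome'))   # alt is kept
    print(convert('<img src="logo.gif" alt="Logo> Welcome'))    # alt is lost
```

In the second call the alt value never closes, so the strict pattern finds nothing and the whole img tag is skipped like any other tag.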


Questions and Answers

Q: Who are you?

A: My name is Scott J. Kleper. I'm a freshman at Stanford University, originally from Pittsford, New York. I'm 18 years old and I've been programming Macs for about four years. I've been writing shareware for almost as long. My other programs include HTML Markup, MacFolklore, Jot, and PowerSpaceTabsPlus. All are available from my home page at http://htc.rit.edu/scott.html


Q: Why did you write HTML Markdown?

A: When I first obtained web access in 1995, I only had a dialup shell account. I still thought the web was great and had lots of fun downloading jokes, song lyrics, etc. The problem was that Lynx always saved the files as HTML instead of text (there's probably a way to do this in Lynx now, or maybe I just didn't know how). I wrote HTML Markdown to just go through the file and remove anything between < and >. At the time, I didn't know much about HTML, so version 1.0 didn't even recognize special characters. Version 2.0 should recognize all of the ISO Latin Character Set and has a much richer feature set in general.


Q: Aren't there other similar programs?

A: Surprisingly, there really aren't too many. There are plenty that convert from some format to HTML, but only a couple that convert from HTML to another format. As far as I know, there are no other HTML to text converters that can do what HTML Markdown does.


Q: Why ASCII text? Why not Styled text or RTF?

A: I had originally planned version 2.0 to have an option to convert HTML into RTF. This would essentially make HTML Markdown an HTML importer for Microsoft Word. Just as the RTF functionality was entering initial testing, Microsoft released Internet Assistant for Word 6. It basically did what I just described, so there was no sense in continuing. I am considering RTF and/or Styled Text for future versions because Microsoft's IA cannot do batch conversion. However, for now I am concentrating on plain text.


Q: What good is the URL Spitter?

A: Let's say you have your entire site in a folder on your Mac. Maybe one link has expired that's listed all over or maybe you just want to see all of the links in your site to get an idea of how they map out and link to each other. The URL spitter will munge through all of the files and copy the URLs into a separate file, indicating which file each URL came from. In my opinion, it's the coolest new feature in version 2.0.
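
For readers curious about the idea, here is a minimal Python sketch of a URL spitter. It is not the program's real implementation. It assumes that URLs are the quoted values of href and src options, that site files end in .html, and that the report lists one "file name, tab, URL" pair per line; extract_urls and spit_urls are hypothetical names.

```python
import re
from pathlib import Path

# Assumption: a URL is the quoted value of an href or src option.
URL_OPTION = re.compile(r'(?:href|src)\s*=\s*"([^"]+)"', re.IGNORECASE)


def extract_urls(html):
    """Return every quoted href/src value, in the order it appears."""
    return URL_OPTION.findall(html)


def spit_urls(folder, report_path):
    """Collect the URLs from every .html file under folder into one report.

    Each report line names the file the URL came from, then the URL.
    """
    lines = []
    for path in sorted(Path(folder).rglob("*.html")):
        text = path.read_text(encoding="latin-1")
        for url in extract_urls(text):
            lines.append(f"{path.name}\t{url}")
    Path(report_path).write_text("\n".join(lines) + "\n", encoding="latin-1")
    return lines


if __name__ == "__main__":
    sample = '<a href="http://www.yahoo.com/">Yahoo</a> <IMG SRC="logo.gif">'
    print(extract_urls(sample))
```

Searching the finished report for an expired address then shows every file that still links to it.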


Q: Where's support for advanced HTML stuff like tables and forms?

A: At its most basic level, HTML Markdown just gets rid of HTML. It was not designed to interpret what is in the HTML file except in limited cases. So the program doesn't attempt to figure out if it's looking at a table or a form or whatever. It just gets rid of the HTML and cleans things up. Tables should still look okay after conversion, though they won't be beautiful. Forms will retain their text, but not the fields.


Q: Where's support for AppleScript and other new Apple technologies? Why is it drag-and-drop only?

A: I think AppleScript is a bit beyond the scope of HTML Markdown. It's meant to be a quick converter, and I don't see where scripting would be a big benefit. The program is drag-and-drop only because that is the easiest way to convert a group of files -- just drag them to the icon. My other HTML converter, HTML Markup, was a drag-and-drop-only program until version 2.0 was released in March. HTML Markdown will always be about one version behind its cousin as far as the interface goes. In other words, the next version of HTML Markdown (if there is one) will have a full interface, similar to HTML Markup 2.0.


Q: I work for a user group, CD-ROM publisher, or magazine. Can we give out your program?

A: If you're going to be distributing it electronically or on disk, you may distribute HTML Markdown without notifying me. However, I still request that you send me a quick note letting me know where it's going to be distributed. If you are going to write an article or review of HTML Markdown, I really really really want to read it. Please please please send me a copy of the article. If you're going to be distributing HTML Markdown 2.0 on a CD-ROM, you must notify me first. I will generally grant permission for distribution, but I want to know which CD it's going to be on. Send permission requests, reviews, etc. to:

Scott J. Kleper
Attn: HTML Markdown 2.0
klep@cs.stanford.edu
134 Caversham Woods
Pittsford, NY 14534-2834
USA


Q: I want to learn more about HTML in general. Where can I look?

A: I learned HTML mainly from online resources. I'd check www.yahoo.com for resources. If you'd prefer a book, I'd recommend "Internet Power Web" by Pacific HiTech for a very quick and basic introduction to web publishing.


Q: Why do you put that credit line in all my files? Just to annoy me into giving you money?

A: It's not to annoy you at all, just to remind you that you are using a shareware program that I put a lot of work into and I would really appreciate it if you would help me out by paying your shareware fee. Feel free to use HTML Markdown for 30 days and try it out. If you use it, you are obligated to pay for it (see below).


Q: Is HTML Markdown really that much faster since it's PowerPC native?

A: Nope, not really. It's interesting, and I've done many tests to find out exactly how much faster the PowerPC version is. Basically, if you're running Speed Doubler anyway, HTML Markdown 2.0 runs at about the same speed whether you run the 68k version in emulation or the native PowerPC version. The difference shows up when you start converting hundreds of files or complex files. When you get up in that range, the PowerPC version is up to 200% faster (or more). The reason is that the real factor that slows down conversion is opening and saving the files. The File Manager isn't PowerPC native yet, so it's still slow whether or not the program is native. So there's some speed boost, but it's not always obvious.


Q: What's going to happen to HTML Markdown in the future?

A: There are lots of features that I'd like to include. I probably won't start the next version right away, unless there are bug fixes to do. But please continue to send me ideas and I'll do what I can to include the features you want in future versions of HTML Markdown.


Q: Can I have the source code so I can write my own converter?

A: No. HTML Markdown 2.0 is a copyrighted product and I will not release the source code. However, I am sympathetic towards other people working on similar projects and I'd be happy to discuss techniques or provide help with your program. Send me email.


Registering HTML Markdown

Writing shareware is a lot of work, especially for a college student. Considering all the time that HTML Markdown can save you, I request that if you find it useful, you pay your shareware fee.

If you do decide to register HTML Markdown, you will receive the registered version, which will no longer put the shareware message in your files or display the shareware alert when you're converting files.

Registration costs $10 for a single-user license. This means that one person can use HTML Markdown 2.0 on one Macintosh. If you're in a multi-user environment, or want to have HTML Markdown 2.0 installed on multiple machines, you may purchase multiple licenses at $10 each or a site license for $200. If you are a registered user of HTML Markdown, you may upgrade to version 2.0 for free if you can receive the new version through email or for a self-addressed stamped envelope if you want it through the US Mail. If you decide to upgrade, you must do so directly through me. Send email to klep@cs.stanford.edu

The preferred way of registering is to send me a check directly. Make your $10 check payable to Scott J. Kleper and please include your email address if possible. I can also take traveler's cheques, money orders, and international money orders in US dollars.
Send checks to:

Scott J. Kleper
134 Caversham Woods
Pittsford, NY 14534-2834
USA

The alternate way of paying is through the Kagi shareware payment service. Paying through Kagi allows you to use a check, credit card, or electronic payment. To register through Kagi, just double-click the "Register KlepHacks" program that came with HTML Markdown 2.0. If you didn't get the Register program when you downloaded it, it is available from:

ftp://htc.rit.edu/pub/register-klephacks.hqx

Again, please include your email address so that I can send the registered version to you as quickly as possible.

One last thing about registering. I really appreciate it when people send comments about and suggestions for the program. I read them all and if you supply an email address, I'll probably contact you. If you use the Kagi method, you can still send comments.


About the Author

I'm currently a freshman at Stanford University, where I am a Computer Science major. I've been using the Macintosh since 1984, when I was just tall enough to reach the keyboard and just curious enough to wonder why the screen wasn't color like all other computers. I started programming in BASIC when I was in third grade and I've never quite gotten over it. Sometimes I have nightmares about GOTO's.

When I was a freshman in high school, I got my first Mac of my very own, an LC. I dug up an old copy of Microsoft BASIC for Macintosh, a terrible implementation of a terrible language. I still wanted to program Macs though, so I took a few classes in Pascal at the Rochester Institute of Technology when I was a sophomore in high school. I came to Stanford for a summer program in 1993, where I learned how to program in C. At about the same time, I started reading Mac programming books to learn some of the toolbox calls.

My first programs were pretty bad. I hacked together a really pathetic paint program in a few weeks and called it ScottPaint. After that I started a game called Reality Sucks, but never got very far with it. My first real Macintosh program actually began as an attempt to learn how to display text and receive input from users for Reality Sucks. I put together a little text editor with the old-style TextEdit records (assuming falsely that they'd be easier to learn) and gradually added features to it. That project became known as Jot, which I worked on for about a year.

During my senior year in high school, I wrote a bunch of little programs: HTML Markup, which converts text into HTML; HTML Markdown 1.0; PowerSpaceTabsPlus, which converts strings of spaces into a single tab; and MacFolklore, which quizzes you on the history of the Macintosh.

If you have any questions, feel free to send me mail at klep@cs.stanford.edu


Special Thanks

Many people have helped me with this project and I'd like to take this opportunity to thank as many of them as I can think of. I'd like to thank my dad, Michael Kleper, for raising me on Macs instead of PCs. I'd like to thank Randy Stahl for getting me started with Mac programming. I'd like to thank Julie Zelenski, my CS prof, for making me figure out why things crash, and not just making them work most of the time. I'd like to thank all the users of HTML Markdown for sending in suggestions and registering the program.

This program has gone through a rather extensive beta testing period with a small number of dedicated beta users. Each has contributed to the program. Most of the graphics were done by Dave Laster, including the splash screen and icon.

I'd also like to thank my mom. Thanks, mom!


Version History

HTML Markdown 1.0 (date unknown)
First release.

HTML Markdown 1.0.1 (6/95)
Fixed a bug that caused the converted file to contain extra stuff when it was saved with the same name as the original. The cause was not setting the end of file properly (or at all!).
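
This kind of bug is easy to reproduce in any language. The Python sketch below is only an analogy and is not the original Macintosh code: it overwrites a file in place with shorter contents, with and without setting the end of file afterward. The function overwrite_in_place and the file name page.txt are invented for the example.

```python
import os
import tempfile


def overwrite_in_place(path, data, set_eof):
    """Write data over the start of an existing file.

    If set_eof is False, nothing marks where the new contents end, so
    leftover bytes from the old, longer file stay on the end.
    """
    with open(path, "r+b") as f:
        f.write(data)
        if set_eof:
            f.truncate()    # end of file = current position


if __name__ == "__main__":
    original = b"<p>Hello, world!</p>"
    converted = b"Hello, world!"
    path = os.path.join(tempfile.mkdtemp(), "page.txt")

    for set_eof in (False, True):
        with open(path, "wb") as f:
            f.write(original)
        overwrite_in_place(path, converted, set_eof)
        with open(path, "rb") as f:
            print(set_eof, f.read())
```

Without the truncate call the converted text is followed by the tail of the old HTML, which matches the "extra stuff" symptom described above.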

HTML Markdown 2.0 (5/96)
Total re-write. Added options for smart conversion, converts escape codes, added URL spitter, more reliable, converts bigger files, faster, uses less memory.


Legal Stuff

HTML Markdown 2.0 is copyright © 1996 by Scott J. Kleper. You may copy and distribute the SHAREWARE version of HTML Markdown 2.0 as long as you include all the documentation and everything that came with it when you downloaded it. This program is shareware. If you use it, you are obligated to pay for it. See above for payment/registration information.

No warranty is included with this program. Use it at your own risk. There are no known bugs with this program. However, the author is not responsible for any problems caused by it.

This program may be included in online file areas and archives. It may be distributed through user groups and shared with other users. CD-ROM publishers MUST contact me at klep@cs.stanford.edu before including HTML Markdown 2.0 on a CD-ROM product.

If you would like to review HTML Markdown for an online or traditional magazine, please contact me so that I can see a copy.